=============== <Original Dataset> ===============
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 10 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | longitude | 20640 non-null | float64 |
| 1 | latitude | 20640 non-null | float64 |
| 2 | housing_median_age | 20640 non-null | float64 |
| 3 | total_rooms | 20640 non-null | float64 |
| 4 | total_bedrooms | 20433 non-null | float64 |
| 5 | population | 20640 non-null | float64 |
| 6 | households | 20640 non-null | float64 |
| 7 | median_income | 20640 non-null | float64 |
| 8 | median_house_value | 20640 non-null | float64 |
| 9 | ocean_proximity | 20640 non-null | object |

dtypes: float64(9), object(1)
memory usage: 1.6+ MB
None

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | 78100.0 | INLAND |
| 20636 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | 77100.0 | INLAND |
| 20637 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | 92300.0 | INLAND |
| 20638 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | 84700.0 | INLAND |
| 20639 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | 89400.0 | INLAND |
20640 rows × 10 columns
=============== <Modified Dataset> ===============
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20433 entries, 0 to 20432
Data columns (total 9 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | longitude | 20433 non-null | float64 |
| 1 | latitude | 20433 non-null | float64 |
| 2 | housing_median_age | 20433 non-null | float64 |
| 3 | total_rooms | 20433 non-null | float64 |
| 4 | total_bedrooms | 20433 non-null | float64 |
| 5 | population | 20433 non-null | float64 |
| 6 | households | 20433 non-null | float64 |
| 7 | median_income | 20433 non-null | float64 |
| 8 | ocean_proximity | 20433 non-null | object |

dtypes: float64(8), object(1)
memory usage: 1.4+ MB
None

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | NEAR BAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20428 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | INLAND |
| 20429 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | INLAND |
| 20430 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | INLAND |
| 20431 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | INLAND |
| 20432 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | INLAND |
20433 rows × 9 columns
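
The two summaries above differ in three ways: the 207 rows with a missing `total_bedrooms` are gone, the index runs from 0 again, and `median_house_value` is no longer a column. One way to get that result is sketched below. It assumes the rows were removed with `dropna` and that the target was set aside for later comparison; the helper name `prepare` and the tiny stand-in frame are hypothetical and not taken from the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame with one missing total_bedrooms value.
df = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "total_bedrooms": [129.0, np.nan, 190.0, 235.0],
    "median_house_value": [452600.0, 358500.0, 352100.0, 341300.0],
    "ocean_proximity": ["NEAR BAY", "NEAR BAY", "NEAR BAY", "INLAND"],
})


def prepare(frame, target="median_house_value"):
    """Drop rows with missing values, renumber the index, split off the target."""
    cleaned = frame.dropna().reset_index(drop=True)
    labels = cleaned[target]                   # kept aside for the later comparison
    features = cleaned.drop(columns=[target])  # what the clustering step would see
    return features, labels


features, labels = prepare(df)
print(features.shape)
print(list(features.columns))
```

Keeping the target in a separate Series means the clustering never sees it, so the later per-cluster statistics stay an independent check.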
=============== AutoML Start ===============
=============== Model : KMeans ===============
Start calculating silhouette_score...( method = KMeans )
Calculating silhouette_score ( k = 2 )
Calculating silhouette_score ( k = 3 )
Calculating silhouette_score ( k = 4 )
Calculating silhouette_score ( k = 5 )
Calculating silhouette_score ( k = 6 )
Calculating silhouette_score ( k = 7 )
Calculating silhouette_score ( k = 8 )
Calculating silhouette_score ( k = 9 )
Calculating silhouette_score ( k = 10 )
Calculating silhouette_score ( k = 11 )
Calculating silhouette_score ( k = 12 )
Calculating silhouette_score ( k = 2 )
Calculating silhouette_score ( k = 3 )
Calculating silhouette_score ( k = 4 )
Calculating silhouette_score ( k = 5 )
Calculating silhouette_score ( k = 6 )
Calculating silhouette_score ( k = 7 )
Calculating silhouette_score ( k = 8 )
Calculating silhouette_score ( k = 9 )
Calculating silhouette_score ( k = 10 )
Calculating silhouette_score ( k = 11 )
Calculating silhouette_score ( k = 12 )
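
The log above scores each candidate k from 2 to 12 with the silhouette coefficient. A minimal sketch of such a scan follows, assuming scikit-learn's `KMeans` and `silhouette_score`. The function name `scan_k`, the synthetic blobs, and the choice to keep the two best-scoring values are illustrative assumptions, not the original AutoML code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def scan_k(X, k_values, random_state=0):
    """Return {k: silhouette score} for a KMeans fit at each candidate k."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        cluster_ids = model.fit_predict(X)
        scores[k] = silhouette_score(X, cluster_ids)
    return scores


# Hypothetical data: three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

scores = scan_k(X, range(2, 7))
# Keep the two highest-scoring values of k as candidates.
top_k = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_k)
```

The silhouette coefficient lies in [-1, 1], and higher values indicate tighter, better separated clusters, which is why the candidates are sorted in descending order.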
best K_s = [2, 5]
max_iter = 100 / algorithm = full / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 100 / algorithm = elkan / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 300 / algorithm = full / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 300 / algorithm = elkan / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 500 / algorithm = full / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 500 / algorithm = elkan / k = 2 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 18112 | 500001.0 | 179200.0 | 14999.0 | 206719.410004 |
| 1.0 | 2321 | 500001.0 | 185400.0 | 22500.0 | 211908.015941 |

max_iter = 100 / algorithm = full / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |

max_iter = 100 / algorithm = elkan / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |

max_iter = 300 / algorithm = full / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |

max_iter = 300 / algorithm = elkan / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |

max_iter = 500 / algorithm = full / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |

max_iter = 500 / algorithm = elkan / k = 5 Done.
========== Compare with original labels ==========
median_house_value grouped by predict:

| predict | count | max | median | min | mean |
|---|---|---|---|---|---|
| 0.0 | 1703 | 500001.0 | 166300.0 | 22500.0 | 197965.870229 |
| 1.0 | 7544 | 500001.0 | 175000.0 | 14999.0 | 200971.803552 |
| 2.0 | 8334 | 500001.0 | 184200.0 | 14999.0 | 214296.768778 |
| 3.0 | 276 | 500001.0 | 180200.0 | 47500.0 | 207263.068841 |
| 4.0 | 2576 | 500001.0 | 183800.0 | 22500.0 | 209440.767857 |
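
Each "Compare with original labels" block reports count, max, median, min and mean of `median_house_value` per predicted cluster. A small sketch of how such a summary could be produced with a pandas `groupby` is given here. The helper name `compare_with_labels` and the five-row sample are hypothetical, and the original code may have computed each statistic separately.

```python
import pandas as pd


def compare_with_labels(predict, target):
    """Summarise a held-out target column per predicted cluster id."""
    frame = pd.DataFrame({"predict": predict, target.name: target.to_numpy()})
    stats = ["count", "max", "median", "min", "mean"]
    return frame.groupby("predict")[target.name].agg(stats)


# Hypothetical values: five house prices and the cluster assigned to each row.
target = pd.Series([100.0, 200.0, 300.0, 400.0, 500.0], name="median_house_value")
predict = [0, 0, 1, 1, 1]

summary = compare_with_labels(predict, target)
print(summary)
```

Read this way, the log shows per-cluster medians that sit close together (179200.0 and 185400.0 for k = 2), and the numbers are identical across every `max_iter` and `algorithm` setting for a given k.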